Uncertain Data Integration Using Functional Dependencies

Authors

  • Naser Ayat
  • Hamideh Afsarmanesh
  • Reza Akbarinia
  • Patrick Valduriez
Abstract

Data integration systems are crucial for applications that need to provide a uniform interface to a set of autonomous and heterogeneous data sources. However, setting up a full data integration system for many application contexts, e.g. web and scientific data management, requires significant human effort, which prevents it from being truly scalable. In this paper, we propose IFD (Integration based on Functional Dependencies), a pay-as-you-go data integration system that allows integrating a given set of data sources, as well as incrementally integrating additional sources. IFD takes advantage of the background knowledge implied within functional dependencies for matching the source schemas. Our system is built on a probabilistic data model that allows capturing the uncertainty in data integration systems. Our performance evaluation shows that our approach achieves significantly better recall and precision than the baseline approaches. The results confirm the importance of functional dependencies and the contribution of a probabilistic data model to improving the quality of schema matching. The analytical study and experiments show that IFD scales well.
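
As a rough illustration of the kind of background knowledge a functional dependency carries, the sketch below checks whether a dependency X -> Y holds in a small table. It is not the IFD matching algorithm described in the paper; the function name `fd_holds` and the sample rows are hypothetical and serve only to show what "X determines Y" means on concrete data.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Row = Dict[str, Hashable]


def fd_holds(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if the functional dependency lhs -> rhs holds in rows.

    The dependency holds when any two rows that agree on every attribute
    in lhs also agree on every attribute in rhs.
    """
    seen: Dict[Tuple, Tuple] = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        # setdefault stores the first rhs value seen for this lhs key;
        # a later, different rhs value for the same key violates the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data, not taken from the paper.
people = [
    {"name": "Ann", "zip": "1012", "city": "Amsterdam"},
    {"name": "Bob", "zip": "1012", "city": "Amsterdam"},
    {"name": "Cat", "zip": "3011", "city": "Rotterdam"},
]

zip_determines_city = fd_holds(people, ["zip"], ["city"])
city_determines_name = fd_holds(people, ["city"], ["name"])
```

On this sample, `zip -> city` holds because rows sharing a zip code share a city, while `city -> name` does not, since two different names appear for the same city. A matcher that knows which dependencies hold in each source can use them as evidence about which attributes correspond, which is the intuition the abstract appeals to.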

Similar resources

An Uncertain Data Integration System

Data integration systems offer uniform access to a set of autonomous and heterogeneous data sources. An important task in setting up a data integration system is to match the attributes of the source schemas. In this paper, we propose a data integration system which uses the knowledge implied within functional dependencies for matching the source schemas. We build our system on a probabilistic ...

Probabilistic XML functional dependencies based on possible world model

With the increase of uncertain data in many new applications, such as sensor networks, data integration, web extraction, etc., uncertainty both in relational databases and XML datasets has attracted more and more research interest in recent years. As functional dependencies (FDs) are critical and necessary to schema design and data rectification in relational databases and XML datasets, it is a...

Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows

Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that frequently appear in the data set and mark them as frequent patterns to enable users to make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...

Integration of Web Sources Under Uncertainty and Dependencies Using Probabilistic XML

We explore in this paper the problem of integrating several web data sources under uncertainty and dependencies. We present a real application of this from web sources about objects in the maritime domain where uncertainties and dependencies are ubiquitous. Uncertainties are mainly caused by imprecise data trackers and imperfect human knowledge whereas dependencies come from the frequent copyin...

Pay-As-You-Go Data Integration Using Functional Dependencies

Setting up a full data integration system for many application contexts, e.g. web and scientific data management, requires significant human effort, which prevents it from being truly scalable. In this paper, we propose IFD (Integration based on Functional Dependencies), a pay-as-you-go data integration system that allows integrating a given set of data sources, as well as incrementally integra...

Publication date: 2017